Abstract

In order to estimate the theoretical auto-correlation function of a time series
from the sample auto-correlation function of one of its realisations, it is usually
assumed without justification that the time series is ergodic. In 1943, Khintchine
made some visionary conjectures about dynamical systems with large numbers
of degrees of freedom which would justify, even in the absence of ergodicity,
approximately the same conclusions. We prove Khintchine's conjectures in some
special cases of a linearly coupled assembly of harmonic oscillators.

Summary

To use the correlogram of the sample values of a stochastic process to estimate
its theoretical auto-correlation function, one generally assumes, without
justification, that the process is ergodic. But in 1943 Khintchine conjectured
propositions of great importance in this matter, which would justify approximately
the same estimates even without ergodicity of the system. We prove particular
cases of Khintchine's conjectures for assemblies of linear oscillators.

Preface

A novel way to justify the use, in Statistical Mechanics, of the equality of time averages 

with phase averages was envisioned by Khintchine in 1943, but he could only prove special 

cases [9], [10]. He suggested that for quite general, non-ergodic dynamical systems, a kind 

of approximate ergodicity for a restricted class of observables should arise when the number 

of degrees of freedom is sufficiently large. ([9], pp. 62-63). 

In their 1965 study of Brownian motion in terms of a Hamiltonian heat bath,
Ford-Kac-Mazur [2] carried out the Gibbs program in detail for a concrete, linear,
Hamiltonian model.

The task of this paper is to recast the model of Brownian motion to show that one may
assume a determined initial condition and derive a stochastic process in the thermodynamic
limit without assuming any initial probability distribution. It extends Khintchine's vision
to an essentially new case (earlier extensions by Ruelle [13] and Lanford [8] assumed weak,
short-range interactions). It is already known that the results of [2] remain true for many
different choices of an initial distribution (see Kim [11] for a survey).

The systems we study are linear Hamiltonian systems and are very far from being 

ergodic. But as Khintchine foresaw, a kind of approximate ergodicity holds good for some 
measurable functions (having particular physical significance) when the number of degrees 
of freedom is very large. In our case, the measurable functions we study are the auto- 
correlations of the time-evolution of the momentum coordinate of one of the particles. 

This paper falls into two halves which are almost separate. The first half is a survey
of the problem, given at a seminar at the Univ. of Havana in 2011, and does not pretend
to any originality. The second half presents the technical details which were left out of the
seminar and are original.

Introduction to the Role of Ergodicity in the Theory of Time Series 

Time Series: Two Contradictory Definitions 

The notion of time series has two definitions which although related cause confusion 

to the student. The first meaning is that a time series is a series of data distributed in 

time. If, for example, $M$ is a dynamical system with Hamiltonian $H(p, q) = \frac{1}{2}(|p|^2 + |q|^2)$,

where $p$ and $q$ are $(n+1)$-dimensional vectors, i.e., a collection of $n + 1$ uncoupled harmonic
oscillators, then since $\dot p = -q$ and $\dot q = p$, $p(t)$ is a time series if $p(0)$ and $q(0)$ are given.
Another example is the (infinite) sequence 



of tosses of a fair coin (there would not be any contradiction of either the laws of Physics 

or the laws of Probability if all future tosses resulted in 1). These are both examples of 

deterministic data since even the coin toss is a function of time in the usual, deterministic, 

sense of the word function. 

The second meaning of time series is that it is a sequence of random variables. This

sense is also called a stochastic process. Two examples which are related to the previous 

examples are: 

Coin Toss: a sequence of independent, identically distributed random variables taking 
the values 1 and -1 with equal probabilities. In fact, the space of all possible sequences of 
results of a coin toss, i.e., the space of all possible sequences of binary digits, can be mapped 
to the unit interval $[0, 1] \subset \mathbb{R}$ by regarding each sequence $X_n$ as the binary expansion of
the real number



This map is an equivalence of probability spaces (it is one to one except on a set of
measure zero) between the space of all possible sequences of tosses and the unit interval
with Lebesgue measure.
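The map from toss sequences to the unit interval can be sketched for finite runs of tosses. This is a minimal illustration, not the author's construction: it assumes the convention that a toss of +1 is read as the binary digit 1 and a toss of -1 as the binary digit 0, and the function name `tosses_to_unit_interval` is hypothetical.

```python
from fractions import Fraction


def tosses_to_unit_interval(tosses):
    """Map a finite run of coin tosses (+1 / -1) to a point of [0, 1].

    Assumption (not from the text): +1 is read as binary digit 1 and
    -1 as binary digit 0, so the n-th toss contributes bit / 2**n.
    """
    x = Fraction(0)
    for n, toss in enumerate(tosses, start=1):
        if toss not in (1, -1):
            raise ValueError("each toss must be +1 or -1")
        bit = 1 if toss == 1 else 0
        x += Fraction(bit, 2 ** n)
    return x


if __name__ == "__main__":
    # Digits 1, 0, 1 give 1/2 + 0/4 + 1/8.
    print(tosses_to_unit_interval([1, -1, 1]))
```

Exact rational arithmetic is used so that two different finite runs never collide through rounding; an infinite sequence would correspond to the limit of these partial sums.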

Dynamical System: put some (any) probability distribution on the set of initial conditions
$\{(p(0), q(0))\} = \mathbb{R}^{2(n+1)}$, for example, the Maxwell-Boltzmann distribution

$$\frac{1}{Z}\, e^{-H(p,q)/kT}\, dp\, dq,$$

where $Z$ is the normalising constant, $k$ is Boltzmann's constant and $T$ is the absolute temperature in Kelvin. (For later
generalisations, we here allow the Hamiltonian to include a linear coupling given by a matrix $A$.)

Then $p(0)$ is now a random variable and so is $p(t)$ for any $t$, so $\{p_0(t)\}_t$ is a continuous
series of random variables and hence a time series in the sense of a stochastic process.
From the standpoint of the rest of statistics, time series are odd and difficult because 

we have to regard the time series in the second sense as the population and the time series 

in the first sense as one sample taken from the population. For example, if the probability 

space of the random variables from the example of the coin toss is taken to be the unit 

interval $[0, 1]$ with Lebesgue measure $dx$, then for any fixed $a \in [0, 1]$ we get a time series

of data, that is, a time series in the first sense, given by $X_n(a)$. (Every sample point has

probability zero, so this is an example of a probability space where probability zero does

not mean 'impossible'.) Unlike the examples of data in first-year statistics courses, we can

never draw more than one sample point since, e.g., we cannot go back to the year 2000 

and 'try again'. 

Other terms used are, e.g., that $\{p_0(t)\}_t$ is an ensemble of time series (in the first
sense), and that given a particular value $(p(0), q(0))$ for the initial conditions, then $p_0$ is
a well-defined function of $t$ called a realisation of the time series. These two senses are
intimately related, the confusion in terminology serves a useful purpose, and it is not going 
to be reformed any time soon. 

Statistics of Time Series. 

The usual descriptive statistics from first-year statistics courses are less useful here. 
The average is misleading if the time series possess a trend. Trends and cycles are more 
important than the average or dispersion. The most important descriptive statistic of a
time series is its auto-correlation function or correlogram. (I wish to emphasise here that
it is a descriptive statistic: it has no more probabilistic significance than did the average
or standard deviation.) Given a series of data $f(t)$, it measures the average influence of
$f(t)$ on $f(t + \tau)$ and is given by

$$\varphi(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(t)\, f(t + \tau)\, dt.$$

This is called the sample auto-correlation function when it is necessary to distinguish it
from a related notion which does not use the time average but uses the whole theoretical
model (population) instead of only one realisation of the process (the data), the phase or
population or model or 'theoretical' auto-correlation function:

$$R(\tau) = E(X_t X_{t+\tau}).$$

Here, E, the expectation, is taken over the probability space, which is sometimes the phase 

space of a dynamical system, so it can be regarded as a phase average as opposed to the 

sample auto-correlation function, which was a time average of actual given data. (For 

simplicity we assume, from now on, that all random variables are centred, i.e., have zero 

expectation.) 
Examples. 

1. Given a data stream $p(t) = \sum_n a_n \cos nt + \sum_n b_n \sin nt$ (with $n \geq 1$), we obtain

$$\varphi(\tau) = \sum_n \frac{a_n^2 + b_n^2}{2} \cos n\tau,$$

which is an even function.

2. If the process $X(t) = p(t)$ 'with random phases', i.e., if each $p(t)$ is turned into a
random variable by introducing random phase shifts in its terms, then $R(\tau)$ is the same
thing. (The phases can be uniformly distributed or Gaussian, it makes no difference.)

3. Coin toss: $R(\tau) = E(X_n X_{n+\tau})$, and

$$E(X_n X_m) = 0 \quad \text{if } n \neq m$$

by independence. Hence $R(\tau) = \delta(\tau)$, a spike.

4. Data from a coin toss: $\varphi(\tau)$ could be any positive definite function (that takes the
value 1 at the origin). For example, one realisation could have a periodic correlogram, a 
sawtooth alternating between 1 and -1, and another set of tosses (all heads) could yield a 
constant function. Neither of these is very close to the theoretical auto-correlation function 
calculated above, but a 'normal' realisation will have a correlogram close to a spike. 
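The contrast drawn in the coin toss examples between exceptional and 'normal' realisations can be seen numerically on finite data. The sketch below is only an illustration under stated assumptions: the correlogram is approximated by a finite average over the available data, the helper names are hypothetical, and a seeded pseudo-random generator stands in for a 'normal' sequence of tosses.

```python
import random


def sample_autocorr(x, lag):
    """Finite-data correlogram: average of x[n] * x[n + lag] over available n."""
    m = len(x) - lag
    if m <= 0:
        raise ValueError("lag must be smaller than the length of the data")
    return sum(x[n] * x[n + lag] for n in range(m)) / m


def alternating(length):
    """Heads, tails, heads, tails, ...: a sawtooth correlogram."""
    return [1 if n % 2 == 0 else -1 for n in range(length)]


def all_heads(length):
    """All tosses equal to +1: a constant correlogram."""
    return [1] * length


def typical(length, seed=0):
    """Pseudo-random tosses, standing in for a 'normal' realisation."""
    rng = random.Random(seed)
    return [rng.choice((1, -1)) for _ in range(length)]


if __name__ == "__main__":
    for name, data in [("alternating", alternating(1000)),
                       ("all heads", all_heads(1000)),
                       ("pseudo-random", typical(20000))]:
        print(name, [round(sample_autocorr(data, k), 3) for k in range(4)])
```

For the pseudo-random data the value at lag 0 is exactly 1 and the values at the other lags are small, which is the finite-data version of a spike.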

Since we only have one sample point from the population, the problem is how to infer
$R(\tau)$ from $\varphi(\tau)$! This is the topic of this paper. The answer has usually been taken to
be the concept of ergodicity, a concept borrowed from Statistical Mechanics to which we 
will turn in the next section. The reader should be warned that within the discipline of 
time series, the term 'large sample theory' has been perverted from its meaning in the rest 
of statistics, since here its original meaning is largely irrelevant (and we will not use it in 
this lecture). Within advanced time series texts, it means the theory of one sample point 
which has a lot of data in it. (See the careful discussion in Fuller [3], pp. 308ff.) 
Levy's philosophy. 

Levy pioneered the method of replacing the study of a stochastic process by a study of 
its auto-correlation function (sometimes called the auto-covariance function or sometimes 
normalised in a certain fashion). His philosophy [12] was that for a wide class of stochastic 
processes, all important properties can be seen in the theoretical auto-correlation function
of the process.

Statistical Mechanics

If M is a dynamical system, as above, and one supposes that even non-engineering data 
sets such as those of climate change or coin tosses are indeed the results of an immensely 
complicated dynamical system with an astronomical number of degrees of freedom, then a 
measurable function $f$ on $M$ is called an observable. But in fact $f$ itself is not observable.
A measurement of $f$ is always macroscopic: it is always the result of letting some part of
the system come into contact with a measurement apparatus, such as a thermometer, and
it takes time for the apparatus, which is of macroscopic dimensions, to react to the system
and reach an equilibrium state. No state which changes rapidly, at a molecular scale, can
be observed by the human eye, so we always model a measurement as an infinite time
average and define the following notation:

$$\langle f \rangle_t = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(p(t), q(t))\, dt.$$

The point is that $(p(0), q(0))$ itself is unknown and uncontrollable. Hence time averages
are impossible to calculate, and yet they are what can be measured scientifically. 

On the other hand, if we take $dp\,dq$ to be Liouville measure on $M$, then phase averages
(we introduce two different notations for this same concept in the following equation)

$$\langle f \rangle = E(f) = \int_\Omega f\, d\mu$$

can be calculated, at least approximately. Here, $\Omega$ is a compact surface of constant energy
within $M$ and the measure $d\mu$ is the appropriate invariant measure inherited from Liouville
measure.

A dynamical system is said to be ergodic if for all measurable $f$, we have

$$\langle f \rangle_t = \langle f \rangle$$

for almost all initial conditions $(p(0), q(0))$ (note that the left hand side depends implicitly
on a choice of initial conditions but the right hand side depends only on $f$).

The importance of ergodicity is that if a dynamical system is ergodic, then macroscopic 
measurements, the only ones we can make, are reliable guides to the phase averages, the 
only quantities we can really calculate. Without ergodicity, there is no way to connect 
theory with experiment. 

If a time series is ergodic then we can use the sample mean to estimate the mean, and 
also use the correlogram to estimate the theoretical auto-correlation function, which then 
by Levy's philosophy tells us everything of interest about the stochastic process. 

Linear systems are the opposite of ergodic. In fact, very few physical systems are 
known to be ergodic. In the 60's, Sinai proved that a system of an ideal billiard ball was 
ergodic. In 1941, Oxtoby and Ulam proved that 'most' dynamical systems are ergodic. 
Nevertheless, there is no proof that, e.g., the dynamical system of the weather or coin 
tossing is ergodic. 
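The failure of ergodicity for linear systems can be made concrete with two uncoupled oscillators, each with Hamiltonian $\frac{1}{2}(p^2 + q^2)$. The sketch below is an illustration under that assumption (the function name is hypothetical): it computes the time average of $p_1(t)^2$ for two initial conditions lying on the same total energy surface and shows that the two time averages differ, so they cannot both equal a single phase average.

```python
import math


def time_average_p1_squared(p1, q1, steps=10000):
    """Average p_1(t)^2 over one period for an oscillator with H = (p^2 + q^2)/2.

    The flow is p_1(t) = p1*cos(t) - q1*sin(t); a uniform grid over one
    period 2*pi averages this trigonometric polynomial up to rounding.
    """
    total = 0.0
    for j in range(steps):
        t = 2.0 * math.pi * j / steps
        p_t = p1 * math.cos(t) - q1 * math.sin(t)
        total += p_t * p_t
    return total / steps


if __name__ == "__main__":
    # Two initial conditions with the same total energy E = 1 for the pair
    # of oscillators, but with the energy split differently between them.
    all_in_first = time_average_p1_squared(p1=math.sqrt(2.0), q1=0.0)
    all_in_second = time_average_p1_squared(p1=0.0, q1=0.0)
    print(all_in_first, all_in_second)
```

The time average of $p_1^2$ equals the energy stored in the first oscillator, which is a constant of the motion and varies from one trajectory to another on the same energy surface.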

Khintchine [9] in 1943 proved that if $R(\tau) \to 0$ as $\tau \to \infty$, then $f$ is ergodic, meaning
that the above equation holds for almost all initial conditions. But $R$ is not $\varphi$, and in
particular it cannot be observed directly and it depends on the choice of $\mu$, the probability
measure. For linear systems, $R$ is quasi-periodic and so is $\varphi$.

Statistical Mechanics considers dynamical systems in which the number of degrees of
freedom is very large. Ocean waves can either be modelled by a non-linear wave equation
such as Navier-Stokes, with a small number of degrees of freedom, or by a linear
Hamiltonian mechanics, at the molecular level, with an astronomical number of degrees of
freedom. Even better, it considers a family of 'similar' dynamical systems parametrised
by the number of degrees of freedom and considers various limits as $n \to \infty$. These limits
include the traditional 'thermodynamic limit' but as Balian and others have argued, one 
can define many different types of such limits. 

Khintchine observed that since the number of degrees of freedom is very large, asymp- 
totic formulae for the quantities of interest should be obtainable by means of the methods 
of probability theory, especially its limit theorems. In particular, he observed that ergod- 
icity in itself was not of central importance since it was asking for too much to have exact 
equalities of time averages and phase averages for all measurable functions. It would suffice 
to have relations for some physically significant observables which hold asymptotically as 
the number of degrees of freedom goes to infinity. In 1943 he published vague but profound 
and visionary conjectures in this regard, but was unable to establish them in more than a 
few special cases and even then only with the help of the assumption that = 1, for which 
he has been much criticised. 

Khintchine's conjectures 

For a family of dynamical systems $M_n$, parametrised by their (increasing) number
of degrees of freedom, and representing in some sense 'the same physics', and for certain
physically significant quantities, each one represented by an observable $f_n$ for each
$M_n$, again in some sense being 'the same' as $n$ increases, Khintchine conjectured that $f_n$
would become approximately ergodic for $n$ sufficiently large. Ruelle [13] and Lanford [8]
were able to make some progress on this for systems with weak and short-range interactions
and for observables that were some sort of average over the entire system, similar
to the thermodynamic quantities such as temperature. Yet Brownian motion is a very
well-known ergodic stochastic process which does not at all fit into this framework: the
momentum of one particle becomes, as the number of other particles increases without
bound, a stationary stochastic process closely related to Brownian motion, known as the
Ornstein-Uhlenbeck process. Since it is the momentum of only one particle, it is not a
thermodynamic quantity nor do the methods of Khintchine-Ruelle-Lanford apply.
The Gibbs Program and Brownian Motion 

In 1965 Ford-Kac-Mazur [2] showed how Brownian motion could arise in the limit
of a sequence of explicit Hamiltonian systems. Their procedure was a model of carrying 
out the program envisioned by Willard Gibbs as long ago as 1900. The breakthrough 
was to allow a very violent, long-range interaction between the particles, one so violent 
as to require a kind of renormalisation in the limit. (In 1961 Schwinger [14] published a 
very interesting quantum precursor of this. Indeed, Schwinger's set-up involved a negative 
temperature amplifier which amplified quantum motion into a classical stochastic process.) 
This successfully carried out Gibbs's program for statistical mechanics for this concrete 
example, certainly one of central importance, and all the more striking since each system 
was linear but the limit stochastic process was ergodic. But they imposed a probability
distribution, that of Maxwell-Boltzmann, on the dynamical systems by fiat (as did Gibbs
himself, who was criticised by Khintchine in this respect), and did not address Khintchine's
conjectures.

Their results ought to be robust in the choice of probability distribution and the 
choice of interaction. Students of Kac, Kim [11], and others have pursued this question of 
robustness.

The key result of Ford-Kac-Mazur is as follows: fix a temperature $T$. Put the corresponding
Maxwell-Boltzmann probability distribution on the space of initial conditions
$\mathbb{R}^{2n+2}$. Then there exists a family of matrices $A_n$, each one giving a linearly coupled system
of harmonic oscillators with Hamiltonian $H_n$, such that, with the appropriate cut-off
in the interaction to avoid singularities,

$$R_n(\tau) \to e^{-d|\tau|}$$

as $n \to \infty$, where $R_n$ is the theoretical (phase) auto-correlation function of $p_0$, the momentum
of the zeroth particle (both of which depend on $n$), and $d$ is a constant. Since one knows
the limit of the theoretical auto-correlation functions, then, by Levy's philosophy, one
knows which stochastic process (up to equivalence) ought to be considered the limit of
these processes. This is all the more striking since the coupling constants, the entries of
the matrices $A_n$, do not possess a limit, but instead grow without bound.
Yet for any finite $n$, $p_0(t)$ is quasi-periodic: it is a finite trigonometric sum whose
frequencies depend on $n$. Hence so is $R_n$, and so is $\varphi_n$. Hence $R_n \not\to 0$ as $\tau \to \infty$
for any $n$. We cannot interchange the limits in $n$ and $\tau$.

A sequence of dynamical systems 
For future use, we introduce some common notation. For $f$ any function on the phase
space $\Omega$ of a dynamical system, let $f_t$ denote the function composed with the flow on the
system for $t$ units of time. Let $\langle f \rangle = \int_\Omega f(\omega)\, d\mu$, whatever the invariant measure $d\mu$. Let
$\langle f \rangle_t = \lim_{T \to \infty} \frac{1}{T} \int_0^T f_t(\omega)\, dt$, which implicitly depends on $\omega \in \Omega$ although this will usually
be suppressed in the notation. The point is to investigate when $\langle f \rangle_t = \langle f \rangle$ approximately
for almost all $\omega$ or at least the overwhelming majority of $\omega$.

Ford-Kac-Mazur introduced a stylised model [2] of Brownian motion, which consists
of a mote whose canonical co-ordinates are $p_0, q_0$, and $n$ particles of the same mass. We are
going to assume $n = 2N$ is even and let all vector and matrix indices run from $-N$ to $N$.
The canonical co-ordinates of the $i$th harmonic oscillator are $p_i, q_i$. Let the Hamiltonian
of this system be $H_n$ and write



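$$H_n = \sum_{i=-N}^{N} \frac{p_i^2}{2m} + \frac{1}{2}\, (q_{-N}, \ldots, q_N)\, A \begin{pmatrix} q_{-N} \\ q_{-N+1} \\ \vdots \\ q_N \end{pmatrix},$$

where $A$ is a symmetric $(n+1)$-square real matrix with positive eigenvalues.
The trajectories of this flow satisfy (taking $m = 1$; $A$ depends implicitly on $n$)

$$p(t) = \cos(A^{1/2} t) \cdot p(0) - A^{1/2} \sin(A^{1/2} t) \cdot q(0).$$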

We focus on $p_0(t)$. If the particles are all alike, it is natural to assume the matrix $A$ is
what is called 'cyclic': each row is the previous one shifted over by one. The eigenvalues
of $A$, $\omega_s^2$, satisfy

$$A_{jk} = \frac{1}{2N+1} \sum_{s=-N}^{N} \omega_s^2\, e^{2\pi i\, s (j-k)/(2N+1)},$$

where $i = \sqrt{-1}$. This is obviously symmetric if we make a simple assumption on the $\omega_s$'s.

Let $\zeta = e^{2\pi i/(2N+1)}$. It is a classical fact about cyclic matrices that

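$$(\cos A^{1/2} t)_{jk} = \frac{1}{2N+1} \sum_{s=-N}^{N} \cos(\omega_s t)\, \zeta^{s(j-k)}.$$

This formula is an expression of the fact that the vectors (there are $2N + 1$ of them as
$i$ runs from $-N$ to $N$) $\frac{1}{\sqrt{2N+1}} (\zeta^{ij})_{j=-N,\ldots,N}$ are an orthonormal basis of eigenvectors of $A$ with
eigenvalues $\omega_i^2$.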
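The classical fact about cyclic matrices used here can be checked numerically for small $N$. The sketch below is a hedged illustration, not the construction of Ford-Kac-Mazur: it assumes the cyclic matrix has entries $A_{jk} = \frac{1}{2N+1}\sum_s \omega_s^2 \zeta^{s(j-k)}$ with $\zeta = e^{2\pi i/(2N+1)}$, the function names are hypothetical, and the sample values of $\omega_s^2$ are arbitrary.

```python
import cmath
import math


def cyclic_matrix(omega_sq, N):
    """Build A[(j, k)] = (1/(2N+1)) * sum_s omega_s^2 * zeta**(s*(j-k)).

    omega_sq maps each s in -N..N to omega_s^2 (hypothetical input format);
    all indices run from -N to N.
    """
    size = 2 * N + 1
    zeta = cmath.exp(2j * math.pi / size)
    idx = range(-N, N + 1)
    return {(j, k): sum(omega_sq[s] * zeta ** (s * (j - k)) for s in idx) / size
            for j in idx for k in idx}


def eigen_residual(A, omega_sq, N, s):
    """Largest entry of A v - omega_s^2 v for the vector v_j = zeta**(s*j)."""
    size = 2 * N + 1
    zeta = cmath.exp(2j * math.pi / size)
    idx = range(-N, N + 1)
    v = {j: zeta ** (s * j) for j in idx}
    return max(abs(sum(A[(j, k)] * v[k] for k in idx) - omega_sq[s] * v[j])
               for j in idx)


if __name__ == "__main__":
    N = 2
    # Arbitrary positive values with omega_s = omega_{-s}, which makes A
    # real and symmetric.
    omega_sq = {s: 1.0 + s * s for s in range(-N, N + 1)}
    A = cyclic_matrix(omega_sq, N)
    print(max(eigen_residual(A, omega_sq, N, s) for s in range(-N, N + 1)))
```

The residual is at the level of rounding error, and with $\omega_s = \omega_{-s}$ the imaginary parts of the entries of $A$ vanish, which is the 'simple assumption' that makes the matrix symmetric.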



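This formula holds for $A^{1/2} \sin A^{1/2} t$ as well. (We omit the proofs of facts about cyclic
matrices, which may be found in Gerhard Kowalewski, Determinantentheorie, as cited in
Ford-Kac-Mazur [2].) Hence (here and elsewhere, all indices run from $-N$ to $N$)

$$p_0(t) = \frac{1}{n+1} \left\{ \sum_k \sum_i \cos(\omega_k t)\, \zeta^{-ki}\, p_i(0) - \sum_k \sum_i \omega_k \sin(\omega_k t)\, \zeta^{-ki}\, q_i(0) \right\}.$$

We put $\hat p(k) = \sum_i \zeta^{-ki} p_i(0)$ and similarly for $\hat q$, and rewrite the sums above as

$$p_0(t) = \frac{1}{n+1} \left\{ \sum_k \hat p(k) \cos(\omega_k t) - \sum_k \hat q(k)\, \omega_k \sin(\omega_k t) \right\}.$$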

Define the auto-correlation (sometimes called the auto-covariance) function of this

trajectory by $\varphi_n(\tau) = \langle p_0(t)\, p_0(t + \tau) \rangle_t$. For each $\tau$, $\varphi_n(\tau)$ is a physical observable on

$M_n$. (As $\tau$ varies, we have a uniform family of physical observables.) Just as all the $M_n$

have, intuitively, the same physical meaning, so too these observables (or, rather, uniform 

families of observables) all have the 'same physical meaning'. Intuitively, each one measures 

how 'random' the trajectory (through a given point of phase space) is. It is a descriptive 

statistic of a definite set of data: whatever the initial conditions 'really are', the data is the 

future path of the trajectory (or the past, it makes no difference), and this is a deterministic 

descriptive statistic. The main result of this paper is to make rigorous the notion that in 

the limit as $n \to \infty$, all normal trajectories have the same autocorrelation function, the

Markoffian one which represents maximal randomness possible in this situation. 

But since this intuitive notion of limit is problematic, we make this notion rigorous 

by talking about normal cells for finite n. For fixed n, we can define a normal trajectory 
as being one which (within certain limits of approximation) has the same auto-correlation 
function as all other normal trajectories, viz., the one which is the best possible approximation
to the Markoffian exponential decay.

In general, the physicist Sir James Jeans outlined a common-sensical view of the
foundations of Statistical Mechanics which has had an influence on his followers, Darwin
and Fowler, and through them, on Khintchine, but has not penetrated as fully into the
consciousness of philosophers of science [16] as it deserves (although the Ehrenfests consider
it carefully in [1]). Statistical Mechanics is defined as the study of the statistical properties
of the normal trajectory. By statistical is meant descriptive statistics, so the notion of
probability does not enter into this definition. We, following Wiener, will mean the auto-correlation
function $\langle f \rangle_t = \varphi(\tau)$ of a trajectory. The important thing is to define normal.
Jeans [4] defined a normal property of a state (or trajectory) to be a property which is 
possessed by the overwhelming majority of states in the system, so that, as the number of 
degrees of freedom increases without bound, the states which do not possess that property 
possess negligible Liouville measure. A state is then defined as a normal state if it possesses 
all those normal properties 'of which it is capable.' This definition of normal was not given 
with full logical rigour: the task of this paper is to fix that in an important example. 

Some random finite trigonometric sums 

Normalise the measure on the surface of constant energy, $M_E$, inherited from Liouville

measure to be total mass unity. In this paper the only properties we are concerned with
are $\varphi(\tau)$. That is, a normal cell is a sequence of subsets $\mathcal{N}_n$ of $(M_n)_{E_n}$ such that: for every
choice of three positive epsilons $\epsilon, \epsilon_1, \epsilon_2$, we have for $n$ sufficiently large that $\mathcal{N}_n$ has measure $1 - \epsilon$
and the $\varphi(\tau)$ are within $\epsilon_1$ of each other for $\tau < 1/\epsilon_2$. The energy level is defined for
traditional reasons, and to make the comparison with traditional results convenient, to be
that energy level which is most probable according to the Maxwell distribution: it is

Intuitively, this would mean that the limits of the auto-correlation functions (uniform
convergence on compact sets) are the same for points in the same cell.

The existence of a normal cell is the kind of approximate ergodicity analogous to 
what Khintchine envisioned, (as is, in a very different way, the dispersion theorem proved 
by Khintchine and Lanford). It is well-known that the limit stochastic process Ford- 
Kac-Mazur constructed from this sequence is ergodic, since it is the Ornstein-Uhlenbeck 
process. 

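It is elementary that in general, for any trigonometric sum

$$p_0(t) = \sum_i a_i \cos(\omega_i t) + b_i \sin(\omega_i t)$$

(here, we may and do assume all $\omega_i > 0$), the auto-correlation function is

$$\varphi(\tau) = \sum_i \frac{a_i^2 + b_i^2}{2} \cos(\omega_i \tau).$$

Applying this result to our formula for $p_0$, we obtain

$$\varphi(\tau) = \sum_k \frac{1}{2(n+1)^2} \left( |\hat p(k)|^2 + |\omega_k \hat q(k)|^2 \right) \cos(\omega_k \tau).$$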
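The elementary fact about trigonometric sums can be verified numerically. The sketch below is an illustration under stated assumptions: the closed form is taken to be $\sum_i \frac{1}{2}(a_i^2 + b_i^2)\cos(\omega_i\tau)$, the frequencies are assumed distinct and commensurable so that the infinite time average reduces to an average over one common period, and the function names and sample coefficients are hypothetical.

```python
import math


def trig_sum(a, b, omega, t):
    """Evaluate sum_i a_i*cos(omega_i*t) + b_i*sin(omega_i*t)."""
    return sum(ai * math.cos(w * t) + bi * math.sin(w * t)
               for ai, bi, w in zip(a, b, omega))


def time_autocorr(a, b, omega, tau, period, steps=4000):
    """Average p(t) * p(t + tau) over one common period on a uniform grid."""
    total = 0.0
    for j in range(steps):
        t = period * j / steps
        total += trig_sum(a, b, omega, t) * trig_sum(a, b, omega, t + tau)
    return total / steps


def closed_form(a, b, omega, tau):
    """Assumed closed form: sum_i (a_i^2 + b_i^2)/2 * cos(omega_i * tau)."""
    return sum(0.5 * (ai * ai + bi * bi) * math.cos(w * tau)
               for ai, bi, w in zip(a, b, omega))


if __name__ == "__main__":
    a = [1.0, -0.5, 0.25]
    b = [0.3, 0.0, 2.0]
    omega = [1.0, 2.0, 3.5]          # all multiples of 0.5
    period = 4.0 * math.pi           # common period 2*pi / 0.5
    for tau in (0.0, 0.7, 3.0):
        print(tau, time_autocorr(a, b, omega, tau, period),
              closed_form(a, b, omega, tau))
```

The two columns agree to rounding error. When two frequencies coincide, the corresponding terms must first be combined into one before the closed form is applied.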

In order to show that normal cells exist, we want to show that the measure of initial
conditions in phase space which yield approximately the same $\varphi(\tau)$ tends to unity as
$n \to \infty$. We may regard this as a random trigonometric sum (regarding the $p_i(0)$ and
the $q_i(0)$ as random variables). We show that the variance of $\varphi(\tau)$ is negligible for large
$n$. In 1866 (see Stroock [15] p. 77) Mehler proved that for $x_1^2 + x_2^2 + \cdots + x_n^2 = \rho^2 n$ and
the uniform surface area measure on the surface of this sphere, normalised to total mass one, as
$n \to \infty$, $x_1$ tends weakly to a Gaussian random variable with mean zero and standard
deviation $\rho$. The rate of convergence can be controlled explicitly; this is merely a concrete
calculation of surface areas on spheres.

It is obvious that the coordinates are uncorrelated, that each $x_i$ is perfectly uncorrelated
with the $x_j$ for $j \neq i$, and that the squares are negatively correlated with
each other. Hence (for $\rho = 1$) $\mathrm{Var}(x_i^2)$ is approximately two, and $\mathrm{Var}(x_i^2 + x_j^2) < \mathrm{Var}(x_i^2) + \mathrm{Var}(x_j^2)$.
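Mehler's observation can be illustrated by simulation. The sketch below is a Monte Carlo illustration only, under the assumption $\rho = 1$ (a sphere of radius $\sqrt{n}$): uniform points on the sphere are generated by normalising independent Gaussian vectors, a standard technique that is not taken from the text, and the function names are hypothetical.

```python
import math
import random


def uniform_on_sphere(n, radius, rng):
    """Uniform point on the sphere of the given radius in R^n.

    Uses the standard trick of normalising a vector of independent
    standard Gaussians.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [radius * x / norm for x in g]


def first_coordinate_stats(n, samples, seed=0):
    """Sample mean of x_1, and sample mean and variance of x_1^2, for radius sqrt(n)."""
    rng = random.Random(seed)
    xs = [uniform_on_sphere(n, math.sqrt(n), rng)[0] for _ in range(samples)]
    sq = [x * x for x in xs]
    mean_x = sum(xs) / samples
    mean_sq = sum(sq) / samples
    var_sq = sum((s - mean_sq) ** 2 for s in sq) / samples
    return mean_x, mean_sq, var_sq


if __name__ == "__main__":
    # For large n the first coordinate should look standard Gaussian:
    # mean near 0, mean square near 1, variance of the square near 2.
    print(first_coordinate_stats(n=200, samples=4000))
```

The printed values depend on the seed, so only their approximate sizes are meaningful; they should be close to 0, 1 and 2 respectively.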

We wish to estimate the variances of the $\hat p(k)$ and the $\omega_k \hat q(k)$.

Because H(j), q) — H{p, q) and, more precisely, because 

/q{k)\ 



(g_A,(0),...gAr(0))-A 



/ Q-n{0) \ 
g_iv+i(0) 

V qN{0) I 



(g(fc),...g(/c))(r'"')-A-(C 



m 

\q{k)^ 



(this is because the matrix {C"^'')im is the change of basis matrix that diagonalises A) it 
follows that 



{q-N{0),...qNm-A 



( 9-iv(0) \ 
g_Ar+i(0) 



V gAr(O) / 

But $2E - \sum_k \hat q(k)^2 \omega_k^2 = \sum_i p_i(0)^2$. Hence $\sum_k \hat q(k)^2 \omega_k^2$ and $\sum_i p_i(0)^2$ are perfectly
anti-correlated and hence have equal variances. But this last is bounded by $2(2N+1)$.
Now the $\omega_k$ are all real and positive in the applications we have in mind for later. Also,
since the matrix is symmetric, $\omega_k = \omega_{-k}$. Hence, obviously, we may arrange that the



are real without altering the auto-correlation function (since g(A;) = g(— A;)). Hence 



Since, by definition, p(/s) = C *'^Pi(0)) we have that ^^j^j^^^ is related to Pi(0) by 
a unitary transformation, the sum of squares of the moduli of the coordinates does not 
change. Then neither does their variance. Hence the variance of ^^^j^^)'^ is bounded by 

2(2iV+l)i _ \ 
(2Ar+l)^ ~ (2JV+1) • 
This very weak ergodicity does not depend on any properties of the $\omega_k$ except that
the matrix $A$ is cyclic and symmetric. It holds even when the dynamics does not tend
towards Brownian motion (which only happens for a very specific choice of $\omega_k$).

We wish to relate Wiener's time auto-correlation function, a deterministic concept, to 
the phase auto-correlation function, at least when the trajectory is normal. Note first that 
as usual, the Maxwell distribution 'bunches up' for very large n around the most probable 
energy value, and thus the average taken with respect to the Maxwell distribution over the 
entire, unbounded, phase space is the same as the average over one energy level ellipsoid, 
with respect to the uniform distribution. An elementary part of what Khintchine proved 
[9] p. 68, is 

Theorem (Khintchine): Let $d\mu_E$ be the normalised measure on the constant energy shell
$\Omega_E$ of $M_n$ with energy level $H = E$ inherited from Liouville measure. Then $d\mu_E$ is
invariant under the flow. Let $\varphi_\omega(\tau)$ be Wiener's time-autocorrelation function for a given
initial condition $\omega \in \Omega_E$. Then the expectation of the time-autocorrelation function is
equal to the phase auto-correlation function, i.e.,

$$\int_{\Omega_E} \varphi_\omega(\tau)\, d\mu_E(\omega) = \langle f_0 f_\tau \rangle.$$

Time auto-correlation functions of the Ornstein-Uhlenbeck process

We now specify a precise dynamics by explicitly choosing the $\omega_k$. Let $\omega_s = \tan$
Ford-Kac-Mazur calculate the phase auto-correlation of each finite stage and pass to

the limit (with a cut-off renormalisation) obtaining the usual auto-correlation function of

the Ornstein-Uhlenbeck process, $\pi e^{-|\tau|}$.

It is part of what they prove that for any compact set $K = \{\tau \in [0, K]\}$, there exists
an $N$ so large that

$$\int p_0(0)\, (p_0)_\tau\, d\mu(\omega)$$

approximates to $\pi e^{-|\tau|}$ to any desired accuracy.

Their method of proof was relatively elementary and can be included here. A simple
trick changes this into a standard cosine transform which can be looked up in any table
of integrals. Let $u = \tan\theta$. Then $\theta = \arctan u$. This integral is, then, equal to one of the
Riemann sums for the improper integral

$$\int_{-\infty}^{\infty} \frac{\cos(u\tau)}{u^2 + 1}\, du.$$

This, as aimed at, is the cosine transform of the bump function $\frac{1}{u^2+1}$. It is equal to $\pi e^{-\tau}$
when $\tau$ is positive, but it is symmetric since cosine is an even function, so it is equal to
$\pi e^{-|\tau|}$. There is no problem with convergence in this calculation as we let $\epsilon \to 0$; the
improper integral is very nicely behaved.
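The passage from the Riemann sum to the cosine transform can be checked numerically. The sketch below rests on an assumption that is not recoverable from the text as it stands: the frequencies are taken to be $\omega_s = \tan(\pi s/(2N+1))$ for $s = -N, \ldots, N$, so that with $u = \tan\theta$ the sum $\frac{\pi}{2N+1}\sum_s \cos(\tau\omega_s)$ is a Riemann sum for $\int_{-\pi/2}^{\pi/2}\cos(\tau\tan\theta)\,d\theta$. The function names are hypothetical, and no cut-off is applied.

```python
import math


def riemann_sum(tau, N):
    """(pi/(2N+1)) * sum over s of cos(tau * tan(theta_s)).

    Assumption: theta_s = pi*s/(2N+1) for s = -N..N, i.e. the hypothetical
    choice omega_s = tan(theta_s) for the oscillator frequencies.
    """
    size = 2 * N + 1
    h = math.pi / size
    return h * sum(math.cos(tau * math.tan(h * s)) for s in range(-N, N + 1))


def limit(tau):
    """Cosine transform of 1/(u^2 + 1): pi * exp(-|tau|)."""
    return math.pi * math.exp(-abs(tau))


if __name__ == "__main__":
    for tau in (0.0, 0.5, 1.0, 2.0):
        print(tau, riemann_sum(tau, 20000), limit(tau))
```

The integrand oscillates rapidly near $\theta = \pm\pi/2$, so for fixed $N$ the agreement degrades as $\tau$ grows; this is consistent with the statement holding on compact sets of $\tau$ for $N$ sufficiently large.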

Theorem: Suppose given $\epsilon, \delta > 0$ and $K$. Then there exists an $N$ so large that the
measure of the set of trajectories such that $|\varphi(\tau) - \pi e^{-|\tau|}| > \epsilon$ for some $\tau \in K$ is less than
$\delta$. (This implies that a normal cell exists, the cell of all trajectories such that on $K$, their
auto-correlation functions are within $\epsilon$ of the phase auto-correlation function $\langle f_0 f_\tau \rangle$.)
Proof. Since the variance of <^(t) for any fixed r is less than (^2N+i) ' Tchebycheff's 
inequality gives us that the measure of the set of trajectories such that (fi{T) differs by 
more than e from {fofr), which is its expected value, is less than If we cover the 
interval K with a uniform mesh of width Aa; then there are values of Tj in this mesh. 
If we treat the measure of the set of trajectories yielding a deviation of (p{ri) by more 
than e from its expectation for each i as independent events, which they are not, then the 
measure of the set of initial conditions such that even one of these violations will occur is
less than ^ . 

As $n$ grows, the difference between the quasi-periodic expectation of $\varphi(\tau)$ and its
limit, the exponential decay $\pi e^{-|\tau|}$, may be arranged to be less than $\epsilon$ on any compact set,
especially $K$. And this latter function clearly satisfies $|\Delta y| < \Delta x$ over this or any other
mesh. Hence, if $\tau$ varies over a region of width $\Delta x$, $\varphi(\tau)$ varies by less than $2\epsilon + \Delta x$. We
may take $\Delta x = .5\epsilon$ and $n$ sufficiently large to get a $4\epsilon$ proof. Q.E.D.
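The counting step of the proof, Tchebycheff's inequality at each mesh point followed by a sum over the mesh, can be written out as a small calculation. The sketch below is illustrative only: the variance bound is passed in as a parameter because its exact value is not assumed here, the mesh is taken to have $\lfloor K/\Delta x \rfloor + 1$ points, and the function names are hypothetical.

```python
import math


def chebyshev_bound(variance, eps):
    """Upper bound on P(|X - E X| >= eps) from Tchebycheff's inequality."""
    return min(1.0, variance / (eps * eps))


def mesh_union_bound(variance, eps, K, dx):
    """Bound on the measure of trajectories deviating by >= eps at some mesh point.

    The mesh covers [0, K] with spacing dx; the per-point bounds are simply
    added, which needs no independence between the events.
    """
    points = math.floor(K / dx) + 1
    return min(1.0, points * chebyshev_bound(variance, eps))


if __name__ == "__main__":
    # Hypothetical numbers: a variance bound that shrinks as N grows
    # eventually pushes the total below any prescribed delta.
    for variance in (1e-2, 1e-4, 1e-6):
        print(variance, mesh_union_bound(variance, eps=0.5, K=10.0, dx=0.25))
```

Adding the per-point bounds is valid whether or not the events are independent, which is why the dependence between mesh points noted in the proof does no harm.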

This method yields an essentially independent proof of the results of Ford-Kac-Mazur 
and their generalisations. It does not seem as if the usual methods of large-sample theory of 
time series, or of Khintchine, Ruelle, or Lanford, or the usual limit theorems of probability 
theory, can be used to obtain this or similar results. I would like to conjecture that normal 

cells in this sense exist for a much wider class of sequences of dynamical systems. 

Now for all practical purposes, a stochastic process can be replaced by its auto- 
correlation function. In fact, a Gaussian stationary stochastic process is determined up 
to equivalence by the phase auto-correlation function. As n increases without bound, the 
time auto-correlation functions of normal trajectories approach a limiting function. We
may define a stochastic process by the requirement that its phase auto-correlation function 
be this limiting function. Thus we have defined the thermodynamic limit of this sequence 
of dynamical systems as a stochastic process. The sample space of this process has nothing 
to do with any Hamiltonian dynamical system or Liouville measure. There does not seem 
to me to be any point in trying to define a new class of dynamical system, with an infinite 
number of degrees of freedom, which would be the limit object here: this would go against 
Levy's philosophy, and would be subject to the use of Occam's razor.

In [5] and [6], I have shown how a quantum analogue at negative temperature, which
is much simpler than the classical case, has many of the same features as the model of this
paper. It would be important to generalise the results of this paper to a negative temperature
heat bath around the mote. Schwinger in [14] has treated the case of quantum negative
temperature Brownian motion, claimed it acts as an amplifier (which is understandable,
since it is done in [5]), and claimed that it amplifies quantum motion to the classical level.
The derivation lacks rigour and uses the usual imprecise notions of probability. This is an
important topic for the future.